ABSTRACT

In reverse genetics, a gene's function is elucidated 
through targeted modifications in the coding region 
or associated DNA cis-regulatory elements. To this
purpose, recently developed customizable tran- 
scription activator-like effector nucleases (TALENs) 
have proven an invaluable tool, allowing introduc- 
tion of double-strand breaks at predetermined 
sites in the genome. Here we describe a practical 
and efficient method for the targeted genome engin- 
eering in Drosophila. We demonstrate TALEN- 
mediated targeted gene integration and efficient 
identification of mutant flies using a traceable 
marker phenotype. Furthermore, we developed an 
easy TALEN assembly (easyT) method relying on 
simultaneous reactions of DNA Bae I digestion and 
ligation, enabling construction of complete TALENs 
from a monomer unit library in a single day. Taken 
together, our strategy with easyT and TALEN- 
plasmid microinjection simplifies mutant generation 
and enables isolation of desired mutant fly lines in 
the F1 generation.



INTRODUCTION 

The generation of mutant organisms with defined genomic 
modifications is the most efficient way in reverse genetics 
to elucidate the functions of gene products or unravel gene 
regulatory mechanisms. This has been successfully exploited
in yeast owing to the ability to exchange DNA via
homologous recombination (HR) between an endogenous
target locus and exogenously introduced DNA carrying the
desired genetic alteration (1,2). In multicellular
organisms, however, such targeted HR is rarely induced,
even when homologous foreign DNA is provided,



with a few exceptions including mouse embryonic stem 
cells or the moss Physcomitrella patens (3,4). 

A breakthrough in targeted genome manipulation was 
achieved by the development of the engineered nucleases: 
zinc-finger nucleases (ZFNs) (5) and, more recently, tran- 
scription activator-like effector nucleases (TALENs) (6), 
which consist of repeats of DNA-binding domains and a 
Fok I nuclease domain. Because Fok I is only active as a 
dimer (7), these enzymes can cut genomic DNA only 
where a pair of two engineered nucleases binds nearby 
in a head-to-head manner. In eukaryotic cells, double- 
strand breaks (DSBs) stimulate intrinsic DNA repair 
mechanisms, either error-prone non-homologous end- 
joining (NHEJ) or error-free HR. In addition to unspecific 
insertions and deletions mediated by NHEJ, targeted in- 
tegration of exogenously provided DNA sequences by HR 
is of substantial value for genome engineering. 

TALENs have recently attracted more attention than ZFNs
because of the unique and simple modularity of their DNA-
binding domains, which confers target specificity (8). Essential
to the sequence-specific DNA recognition by TALENs is
the contiguous repeat of DNA-binding modules of
transcription activator-like effectors (TALE-repeats),
which varies in number from 15.5 to 19.5 repeats among
most naturally occurring TALEs (9). Each module
consists of 34 amino acids recognizing a single base on 
DNA. The base preference of each module is specified 
by the two residues at 12th and 13th positions, called 
repeat variable di-residues (RVDs), and a set of RVD- 
base codes has been deciphered (10-13). Because the 
modules that preferentially bind to a unique base have 
been identified (e.g. NI-adenine, NG-thymine, NK-guanine
and HD-cytosine), TALENs with desired
sequence specificity can be constructed by arranging 
these distinctive modules in a specific order. There is, 
however, a constraint in selection of TALEN target 
sites. A thymine is the base that precedes the sequences 
bound by TALE-repeats. The thymine, often called T at 
position 0, is essential for the targeting of designer TALEs
or TALENs and is recognized by two degenerate modules at
the N-terminus (14,15). Nonetheless, designing and
constructing TALENs as tailored molecular scissors
for any sequence of interest is
simple and straightforward.
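
The design rule described above can be illustrated with a minimal Python sketch. It uses only the RVD-base code and the T-at-position-0 constraint stated in the text; the function names `design_rvds` and `repeat_count` are hypothetical and are not part of any published TALEN design tool.

```python
# RVD code as stated in the text: NI-adenine, NG-thymine, NK-guanine, HD-cytosine.
RVD_FOR_BASE = {"A": "NI", "T": "NG", "G": "NK", "C": "HD"}


def design_rvds(genomic_seq: str, start: int, length: int) -> list[str]:
    """Return the RVD order for the `length` bases beginning at index `start`.

    Hypothetical helper. Raises ValueError when the base immediately
    preceding the site (position 0) is not a thymine.
    """
    seq = genomic_seq.upper()
    if start < 1 or start + length > len(seq):
        raise ValueError("site must lie inside the sequence and have a preceding base")
    if seq[start - 1] != "T":
        raise ValueError("no thymine at position 0")
    return [RVD_FOR_BASE[base] for base in seq[start:start + length]]


def repeat_count(rvds: list[str]) -> float:
    """Repeat count in the notation used here: the last module counts as a half repeat."""
    return len(rvds) - 0.5
```

As a check against Table 1, calling `design_rvds` on the TALEN-w-R target sequence TCATCAGCCGTCTTCCGAG, with a T prepended as an assumed position-0 base (the flanking base is not shown in the table), reproduces the 19 RVDs listed for that TALEN, and `repeat_count` gives 18.5.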

The utility of TALENs for targeted germ line mutagen- 
esis has been successfully proven in a wide range of model 
organisms, including nematodes (16), insects (17-21), 
zebrafish (22-26), Xenopus (27) and some mammals 
(28,29). In the fruit fly Drosophila melanogaster, based 
on studies with ZFNs (30), the microinjection of 
TALEN mRNA into the syncytial embryo has been 
used to generate targeted mutations in the germ line 
(18,21). However, the efficiency of TALEN-mediated 
targeted mutagenesis in Drosophila is generally lower 
than in other model organisms such as zebrafish or 
Xenopus, where TALEN mRNAs are delivered at a 
single-cell stage. Therefore, identification of mutant 
alleles without a visible phenotype is laborious and 
requires interrogation of the genomic TALEN target 
loci. The use of genetic markers has made identification 
of transgenic mutants and tracking of the transgenes an 
uncomplicated matter. However, to date, no efforts have 
been made to use a traceable marker as an indicator of 
TALEN-induced mutations. TALEN-mediated direct in- 
frame introduction of fluorescence genes into endogenous 
genes has been attempted in cultured cells (31,32) and 
zebrafish (26). Unfortunately, the utility of these selection 
markers depends on the expression levels of the fusion 
proteins. 

Here we develop a simple and efficient method of 
assembling TALENs into an expression vector that can 
subsequently be used for Drosophila embryo microinjec- 
tions. We demonstrate that we can induce targeted 
locus-specific mutations via NHEJ as well as marker 
gene integration via HR in the Drosophila genome. This 
traceable marker allows efficient identification of the 
engineered flies. 

MATERIALS AND METHODS 

Construction of TALEN backbone plasmid 

The TALEN backbone created in this work is a modification
of the AvrXa7-FN from B. Yang's lab (33). Owing to
the high homology and related origin of the AvrXa7 and
AvrXa10 proteins, the truncation produced in this study
was modelled after the AvrXa10 truncation of Sun et al.
(34). The resulting N3-C1 truncation
retained 207 amino acids on the N-terminus and 63
amino acids on the C-terminus (Supplementary Figure
S1). The nuclear localization signal (NLS) from SV40
T-antigen was added to the N-terminus. The CAT and
ccdB selection markers were inserted between the N3 and
C1 fragments with Mlu I and Aat II. To allow insertion
of a tailored TALE-repeat in the easy TALEN assembly
(easyT) protocol, Bae I sites were created on either side
of the CAT-ccdB cassette. The TALEN.N3-C1 backbone
was placed under the Copia promoter for subsequent expression
in Drosophila embryos.



Construction of unit template plasmids 

The template DNA sequences for individual monomer
units were created by dimerizing each pair of DNA
oligos shown in Supplementary Table S1 and were cloned
into the pGEM-T Easy vector (Promega). The RVDs used
were NI for A, NG for T, NK for G and HD for C.
Because polymerase chain reaction (PCR) amplification
of repetitive sequences is inefficient and amplification of
4-mer repeats is required in the easyT protocol, four
types of template DNA sequence (types a, b, c and
d) were created for units 2 to 19 without changing the
amino acid sequences (Supplementary Figure S2).

TALEN construction 

The TALEN-y pair targeting y (chrX: 253 627-263 693) 
was constructed with a modular assembly protocol (33), 
and subsequently cloned into the TALEN N3-C1 
backbone. TALEN-w pair (chrX: 2 686 269-2 686 214), 
TALEN-upd pair (chrX: 18 207 553-18 207 606) and 
TALEN-wg pair (chr2L: 7 325 600-7 325 658) were 
created with the easyT protocol. See Supplementary 
Methods for step-by-step easyT protocols and backup 
troubleshooting protocols. The TALE-repeats and target
DNA sequences for each TALEN pair are shown in
Table 1.

Construction of Donor Plasmids 

The right and left homology arms of donor DNA were 
separately amplified by PCR from a wild-type fly genome 
as a template. Both arms and 3xP3-EGFP-SV40 cassette 
(35) were cloned into pGEM-T Easy vector. The regions 
used for homologous donor sequences were as follows: 
Donor-y, chrX: 251 198-255 994; Donor-upd, chrX:
18 205 583-18 209 591; and Donor-wg, chr2L: 7 323 606-
7 327 640. On Donor-upd and Donor-wg, the putative
activator protein-1 (AP-1)-binding sequences TGA[C/G]TCA
at the TALEN-upd or TALEN-wg target sites were
deleted for further study (TK, unpublished data). The
primers used are available on request. 
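
As an illustration of the motif named above, the following Python sketch locates and removes TGA[C/G]TCA sequences in a donor sequence. It is not the cloning procedure used in this work; the helper names and the toy sequence in the usage note are hypothetical. Because TGACTCA and TGAGTCA are reverse complements of each other, scanning one strand is sufficient.

```python
import re

# Putative AP-1-binding motif as written in the text: TGA[C/G]TCA.
AP1_MOTIF = re.compile(r"TGA[CG]TCA", re.IGNORECASE)


def find_ap1_sites(seq: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched sequence) for every motif occurrence."""
    return [(m.start(), m.group()) for m in AP1_MOTIF.finditer(seq)]


def delete_ap1_sites(seq: str) -> str:
    """Return the sequence with every motif occurrence removed (hypothetical helper)."""
    return AP1_MOTIF.sub("", seq)
```

For the toy input `"aaTGACTCAggTGAGTCAtt"`, `find_ap1_sites` reports the two motifs at positions 2 and 11, and `delete_ap1_sites` returns `"aaggtt"`.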

Drosophila embryo microinjection 

The concentrations of TALEN and donor plasmids used
are shown in Tables 2 and 3. Syncytial blastoderm
embryos of wild type or w^1118 lig4^169 (36) were used for
microinjection. To detect TALEN-induced germ line y or
w mutants, the eclosed flies (F0) were individually crossed
with y^1 w^1 or w^1118 mutants of known lesion, respectively.
Because the level of 3xP3-EGFP expression is
influenced by chromosomal location, mutants (F1)
expressing EGFP in their eyes were identified
in a w^1118 background. Frequencies of targeted mutagenesis
or gene integration were calculated as the
frequency of yielders: the number of F0 yielding mutant
offspring in F1 divided by the total number of fertile F0.
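
The frequency of yielders defined above is a simple ratio; the short Python sketch below makes the calculation explicit. The function name `yielder_frequency` is hypothetical. Note that the "Total" values in Tables 2 and 3 follow from pooling the male and female counts, not from averaging the two percentages.

```python
def yielder_frequency(yielders: int, fertile_f0: int) -> float:
    """Percentage of fertile F0 flies that produced mutant F1 offspring."""
    if fertile_f0 <= 0:
        raise ValueError("at least one fertile F0 is required")
    if not 0 <= yielders <= fertile_f0:
        raise ValueError("yielders must lie between 0 and the number of fertile F0")
    return 100.0 * yielders / fertile_f0


def pooled_frequency(counts: list[tuple[int, int]]) -> float:
    """Pool (yielders, fertile F0) pairs, e.g. male and female, before dividing."""
    total_yielders = sum(y for y, _ in counts)
    total_fertile = sum(n for _, n in counts)
    return yielder_frequency(total_yielders, total_fertile)
```

With the y counts at 50 ng/µl from Table 2 (19/44 males, 5/45 females), `pooled_frequency([(19, 44), (5, 45)])` rounds to 27.0%, matching the reported total of 24/89.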

Validation of targeted mutagenesis 

Genomic DNA of TALEN-mediated mutants was
isolated from single flies with the QIAamp DNA Micro Kit
(QIAGEN). The TALEN-y and TALEN-w target sites



Page 3 of 9 Nucleic Acids Research, 2013, Vol. 41, No. 17 e163



Table 1. TALENs constructed in this work

TALEN        Repeats  Assembly               RVDs (upper line) and target sequence (lower line)

TALEN-y-L    23.5     Modular assembly (33)  NI NI NG NG HD NG HD HD NG NI NG NG NG NG NI NG NG NG NG NG NG NI NK NK
                                             aattctcctattttattttttagG

TALEN-y-R    23.5     Modular assembly (33)  HD NI NI NI HD NG NK HD NK NK NG HD HD NI NG NK NG NG NG NI NG NI NG NI
                                             CAAACTGCGGTCCATGTTTATATA

TALEN-w-L    18.5     easyT (this study)     HD NK NK HD HD NI NG HD NI NK NI NI NK NK NI NG HD NG NG
                                             CGGCCATCAGAAGGATCTT

TALEN-w-R    18.5     easyT (this study)     NG HD NI NG HD NI NK HD HD NK NG HD NG NG HD HD NK NI NK
                                             TCATCAGCCGTCTTCCGAG

TALEN-upd-L  14.5     easyT (this study)     NK HD HD HD NK NI HD NI NG NG NG NG NK HD HD
                                             gcccgacattttgcc

TALEN-upd-R  14.5     easyT (this study)     NI HD HD NK NG NI HD NI NK HD HD NK NG HD NG
                                             accgtacagccgtct

TALEN-wg-L   18.5     easyT (this study)     HD NI NG HD NG NK NI NG NK HD NG NG HD NI HD NI NK NI NI
                                             catctgatgcttcacagaa

TALEN-wg-R   18.5     easyT (this study)     NK NI NI NG NG HD HD NG NK NI NI NI HD NG NK NI NI NG HD
                                             gaattcctgaaactgaatc

In each TALEN target sequence, protein coding sequences are shown in upper case, and intron or intergenic regions are written in lower case.


Table 2. TALEN-mediated targeted gene mutagenesis

Target       Recipient   Concentration   Frequency of targeted gene integration (a)
Gene  Chr.               (ng/µl each)    Male            Female         Total

y     X      Wild type    25             27.6% (8/29)     8.3% (1/12)   22.0% (9/41)
                          50             43.2% (19/44)   11.1% (5/45)   27.0% (24/89)
                         100             22.4% (11/49)    6.8% (3/44)   15.1% (14/93)
                         250              0.0% (0/39)     0.0% (0/42)    0.0% (0/81)
w     X      Wild type    50             12.5% (3/24)     0.0% (0/14)    7.9% (3/38)
                         100              0.0% (0/68)     0.0% (0/50)    0.0% (0/118)

(a) The frequencies were calculated as frequency of yielder.



Table 3. TALEN-mediated targeted gene integration

Target       Homology length     Concentration (ng/µl)   Frequency of targeted gene integration (a)
Gene  Chr.   of donor DNA (kb)   TALEN    Donor          Male             Female         Total

y     X      5.0                 50       100             1.6% (1/63)     1.5% (1/67)     1.5% (2/130)
upd   X      4.0                 50       500             8.3% (9/108)    1.6% (3/189)    4.0% (12/297)
             4.0                 50       100            14.3% (11/77)    6.7% (5/75)    10.5% (16/152)
wg    II     4.0                 50       100             2.6% (3/114)    1.4% (1/72)     2.2% (4/186)

(a) The frequencies were calculated as frequency of yielder.



at y or w loci were amplified and sequenced. TALEN- 
mediated targeted 3xP3-EGFP integrations were also 
determined by PCR amplification and sequencing. 
Primer pairs used in genomic DNA amplification are 
shown in Supplementary Table 3. 

In silico estimation of TALEN specificity as a function of
TALEN-binding sequence length

To assess the TALEN target specificity, we used a custom- 
made R script using different bioconductor packages (37) 
and multicore (parallel processing of R code on machines 
with multiple cores or CPUs, Simon Urbanek, version 



0.1-7). To estimate target specificity of a TALEN-L/R 
pair of a given TALE-repeat length, 500 TALEN target 
sites, which consist of TALEN-L- and -R-binding se- 
quences separated by a spacer sequence ranging from 12 
to 32 bases (randomly assigned in each sample), were 
sampled uniformly from D. melanogaster genome. Here, 
TALE-repeat lengths ranging from 5 to 20 were con- 
sidered. In total, 500 samples were independently 
generated from each of the following chromosomal 
arms: chr2L, chr2R, chr3L, chr3R, chrX, chr2LHet, 
chr2RHet, chr3LHet, chr3RHet and chrXHet. 
The TALEN-L- and -R-binding sequences were subse- 
quently used to generate a set of sequences representing 
all potential targeting sites. For this purpose, the TALEN-
L- and -R-binding sequences were joined to either end of
spacer sequences of Ns, which varied in length from 12 to
32: [TALEN-L-binding sequence] + [spacer: N12-32] +
[TALEN-R-binding sequence]. In addition, to
account for the targeting sites caused by TALEN-L
and TALEN-R self-pairing, [TALEN-L-binding sequence]
+ [spacer: N12-32] + [TALEN-L-binding sequence]
and [TALEN-R-binding sequence] + [spacer: N12-32] +
[TALEN-R-binding sequence] were generated. All sequences
(63 in total) were aligned to the D. melanogaster
reference genome (BDGP Release 5 dm3) requiring a 
perfect match for both TALEN-binding sequences while 
ignoring the spacer sequence of Ns. A TALEN target site 
was considered to be specific if only the initial sequence 
uniquely aligns to its origin and all other constructed se- 
quences fail to align. Hence, sequence sets exhibiting more 
than one alignment are considered to be unspecific. Using 
all 500 samples, the TALEN target specificity for a given 
length and chromosomal arm was calculated as the 
number of specific TALENs divided by the number of 
samples. To assess the contribution of genomic repeat 
regions to unspecific TALEN-binding sites, we subtracted 
all samples derived from repeat regions of the genome. 
Repeat regions were identified by Tandem Repeats 
Finder and RepeatMasker masks, including, for 
example, simple repeats and transposable elements, con- 
tained in the D. melanogaster object provided by the 
BSgenome bioconductor package. 
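
The specificity test described above can be sketched in a few lines of Python. This is a simplified illustration, not the R script used in this work: it scans an in-memory sequence with regular expressions in place of a genome aligner, searches both strands (an assumption about the aligner's behaviour), and all function names are hypothetical.

```python
import re

SPACER_LENGTHS = range(12, 33)  # spacers of 12 to 32 Ns, as in the text
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of an upper-case A/C/G/T sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def build_queries(left: str, right: str) -> list[tuple[str, int, str]]:
    """L/R, L/L and R/R pairings for every spacer length: 3 x 21 = 63 queries."""
    pairings = [(left, right), (left, left), (right, right)]
    return [(a, n, b) for a, b in pairings for n in SPACER_LENGTHS]


def count_hits(genome: str, first: str, spacer_len: int, second: str) -> int:
    """Count perfect matches of first + N{spacer_len} + second on both strands."""
    # The lookahead lets overlapping matches be counted; the spacer is ignored.
    pattern = re.compile(
        f"(?={re.escape(first)}[ACGT]{{{spacer_len}}}{re.escape(second)})"
    )
    return sum(
        1 for strand in (genome, revcomp(genome)) for _ in pattern.finditer(strand)
    )


def is_specific(genome: str, left: str, right: str) -> bool:
    """Specific if the 63 queries together hit the genome exactly once.

    A site sampled from the genome always hits its own origin, so a total
    of one means that no other constructed sequence aligns.
    """
    return sum(count_hits(genome, a, n, b) for a, n, b in build_queries(left, right)) == 1


def specificity(genome: str, sites: list[tuple[str, str]]) -> float:
    """Fraction of sampled (left, right) sites that are specific."""
    return sum(is_specific(genome, l, r) for l, r in sites) / len(sites)
```

In practice the genome would be supplied per chromosomal arm and the sites drawn by uniform sampling; a regular-expression scan of this kind is only practical for small test sequences.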



RESULTS

The length of TALE-repeats required for specific 
targeting in Drosophila genome 

TALENs with desired sequence specificities are con- 
structed by arranging the order of TALE-repeat 
modules. To gain the target specificity, TALE-repeats 
have to contain a certain number of modules. 
Additionally, the high site specificity of a DSB introduced by a
TALEN pair is in part supported by the requirement
that the Fok I nuclease domain acts as a dimer (7). The
genome of D. melanogaster is ~ 180 Mb in length, consist- 
ing of ~ 120 Mb of euchromatic sequences and 60 Mb of 
heterochromatic regions (38). Thus, to determine the 
length of the TALE-repeats required for minimizing un- 
favourable off-target DNA cleavages in D. melanogaster 
genome, we computationally assessed TALEN-pair- 
binding specificity as a function of the TALE-repeat- 
binding sequence length. To this end, we sampled 500 
TALEN-L/R pairs (TALEN target sites) of a given 
TALE-repeat length uniformly from the euchromatin or 
heterochromatin of each chromosomal arm of 
D. melanogaster. Subsequently, a set of possible target 
sites by TALEN-L/R pair and TALEN-L or -R self- 
pairing were constructed and aligned to D. melanogaster 
reference genome (Figure 1A). A TALEN target site was 
considered to be specific if only the initial sequence 
uniquely aligns to its origin and all other constructed 
sequences fail to align. The in silico analysis revealed 
that TALE-repeats longer than 12 bp reached a plateau 



[Figure 1B, x-axis: length of single TALEN recognition sequence, 5-19 bp]

Figure 1. Target specificity of TALEN-pairs as a function of the 
binding sequence length. (A) The overview of in silico analysis. Five 
hundred TALEN-pair target sites were sampled uniformly from 
euchromatic or heterochromatic regions of the different chromosomal 
arms. The specificity of a TALEN-L/R pair in each sample was 
determined by aligning all the potential targeting sites to 
D. melanogaster reference genome (BDGP Release 5 dm3). Specific 
TALEN-pairs contain exactly one targeting site within the genome, 
and hence, TALEN-pairs exhibiting multiple targeting sequences are 
considered to be unspecific. (B) The probability of specific targeting 
was calculated as the number of specific TALEN pairs divided by the 
number of samples. Results obtained for different chromosome arms 
were summarized using a boxplot representation. To assess the contri- 
bution of genomic repeat regions (e.g. simple repeats or transposable 
elements) to the fraction of unspecific TALEN pairs, the specificity was 
estimated using only samples localizing outside of annotated repeat 
regions within the euchromatin. 



in TALEN-pair-binding specificity in the euchromatic 
regions of the genome (Euchromatin in Figure 1B). To
assess the contribution of genomic repeat regions to unspecific
TALEN-binding sites, we subtracted the samples
derived from repeat regions such as simple repeats or
transposable elements on the euchromatin [Euchromatin
(non-repetitive) in Figure 1B]. On the other hand, more
than half of TALEN-pairs sampled from heterochromatic
regions still exhibited off-targets even with TALE-repeat
length of 20 bp (Heterochromatin in Figure 1B). These
results suggested that TALEN-pairs of TALE-repeat 
Figure 2. Construction of TALENs with the easyT protocol. (A) A
schematic representation of a TALEN with a TALE-repeat length of
18.5 modules. The TALE-repeat is assembled from 20 monomer units.
The boundaries of monomer units were shifted from those of the
TALE-repeat modules. (B) Overview of TALEN cloning. In the first
step, four units are assembled into 4-mers in a 'digation' reaction. In
the second step, 4-mers are PCR-amplified, run on an agarose gel, gel-
extracted and concentrated. Finally, 4-mers are assembled into the
TALEN backbone plasmid in the second digation reaction. Yellow
and blue arrows indicate primers used for 4-mer amplification.



longer than 12 bp are required for specific targeting within 
non-repetitive euchromatic regions of D. melanogaster 
genome. Accordingly, we constructed the TALEN-pairs 
recognizing sequences >15bp (Table 1). 

Development of the easyT protocol 

A variety of TALEN assembly protocols are now avail- 
able (39). However, these protocols rely on extensive 
plasmid libraries, special reagents or extended cloning 
schemes requiring many days of work (22,33,40-44). We 
developed a new easyT protocol that enables us to con- 
struct TALENs of up to 18.5 TAL-repeat modules in a 
single cloning process (Figure 2, Supplementary Figures 
S1-S5 and Supplementary Methods). Importantly, the 
activity of the restriction enzyme Bae I at 25°C allowed 
combining DNA digestion and ligation into a single 1-h 
'digation' reaction, thereby saving time and effort 
(Figure 2B). In the first step, individual units were assembled
into 4-mers in a first digation reaction. The DNA sequence of
each unit within a 4-mer had been altered to reduce
nucleotide sequence homology without changing the amino
acid sequence, which enabled efficient PCR amplification
of 4-mers regardless of the combination of the four units
(Supplementary Figure S2). In a subsequent second
digation, a TALE-repeat was assembled into the pre-digested
TALEN backbone plasmid, followed by a
conventional cloning procedure. Thus, construction of
custom TALENs can be achieved in a single day.
Besides the simplicity of the cloning procedures,
easyT offers a unique troubleshooting option. In
regular Golden Gate-based TALEN assembly protocols
(39), the assembled units are no longer replaceable. By
shifting the boundaries of assembly units from those of
TALE-repeat modules, however, we made it possible to
embed unique restriction sites at the borders of every
4-mer unit, enabling users to replace any 4-mer in case
it carries a mutation caused by PCR error or misligation
(Supplementary Figure S3).

Targeted mutagenesis in Drosophila germ line by 
microinjection of TALEN expression plasmids 

The microinjection of plasmid DNA into Drosophila 
embryos has been routinely used to generate transgenic 
flies. Co-injection of P-element-based transformation 
vector with a transposase-expressing helper plasmid can 
efficiently integrate the transgenes into the genome of 
germ cell lineage (45). Additionally, the use of plasmid 
DNA for TALEN introduction eliminates an mRNA syn- 
thesis step. Hence, the TALEN plasmids synthesized via 
easyT were used directly in embryo microinjections. We 
first tested whether TALEN plasmid DNA microinjection 
can be used to introduce DSB in Drosophila embryos by 
targeting two genes, yellow (y) and white (w), mutations of 
which result in a visible phenotype. Flies carrying a 
knockout of y display a characteristic yellow body pheno- 
type, while mutation of w results in reduced or completely 
lost red eye pigmentation. As shown in Table 2, a series of
different TALEN plasmid concentrations were tested for y
targeting, with a peak mutation frequency [27.0% (24/89)]
observed at a concentration of 50 ng/µl for each
TALEN plasmid. Interestingly, the mutation frequency
was higher in males than in females at every
concentration that yielded mutants. For w targeting, 50 ng/µl and
100 ng/µl concentrations were tested, yielding germ line w
mutants with 7.9% (3/38) and 0.0% (0/118) frequencies,
respectively (Table 2). The optimal TALEN concentration
is likely to be a function of the TALEN pair affinity to its 
recognition sequence, the chromatin status of the target 
locus, and potential off-target sites in the genome and thus 
might need to be adjusted for individual TALEN pairs. 

The targeted mutations were determined by sequencing
the genomic TALEN target region. For the y mutants
generated, 10 randomly selected independent mutant
lines each carried NHEJ-induced indels at the
target site (Figure 3A, C and D). Interestingly, targeting
of the w gene resulted in a distinctive genomic lesion in all 
three mutant lines derived from individual microinjected 
embryos (Figure 3B). The deletion of nine nucleotides 
from the coding sequence caused reduced red eye pigmen- 
tation in mutants (Figure 3E and F) and could be due to 
an accidental 5-bp microhomology in the spacer sequence 
between two TALEN-binding sites. In summary, TALEN 
plasmid microinjection is a convenient alternative to 
mRNA microinjection for germ line mutagenesis in 
Drosophila embryos. 
Figure 3. Targeted germ line gene mutagenesis. (A) TALEN-induced
mutations in y. The target site was chosen at the junction of exon 2 and
intron, as indicated with the scissors mark. Ten y mutants (y^TALEN)
derived from individual F0 were sequenced. (B) TALEN-induced mutations
in w. The target site encodes the end part of the ATP-binding cassette
transporter domain. Three independently generated F1 mutants have
the same genomic lesion, a 9-bp deletion. The sequence
microhomology shown with an asterisk at the target site indicates
probable DSB repair through microhomology-mediated end-joining.
(C and D) The body colour phenotypes of wild type and y^TALEN
flies, both male, are shown. (E and F) The eye colour phenotypes of
wild type and w^TALEN flies are shown. w^TALEN is a hypomorphic allele,
showing an orange eye colour. In the TALEN target sequences, the
DNA aberrations are shown in red and TALEN-binding sites in blue. 
The intronic and exonic sequences are shown in lowercase and 
uppercase, respectively. Scale bar represents 1 kb length. 



TALEN-mediated targeted gene integration by HR in 
Drosophila germ line 

Next, we investigated TALEN-mediated targeted gene in- 
tegration via HR between the target locus and an exogen- 
ously provided donor DNA. To expand the TALEN 
application to any region in the genome, we constructed 
two TALEN-pairs targeting the intergenic region of 
unpaired (upd) and the wingless (wg) locus, respectively 
(Table 1). To that end, three donor DNA plasmids were 
constructed to contain an eye-specific enhanced green 
fluorescent protein marker gene [3xP3-EGFP (35)] 
flanked on either side by homology arms (Figure 4A-C). 
Insertion of the marker gene in the y donor DNA disrupts the
coding sequence of the y gene. In the other two cases, the
insertion results in a deletion of a putative AP-1-binding
sequence in the upd and wg donor DNA. Co-injection of
donor DNA plasmids (100 ng/µl) along with TALEN
plasmids (50 ng/µl each) was carried out in a DNA



ligase IV (lig4) mutant background (36), shown previously 
to have higher efficiency of donor DNA integration via 
HR (30). 

The flies carrying germ line integrations of the marker 
gene were identified by green fluorescence in eyes of the 
progeny of microinjected embryos (Figure 4D-F). 
The fluorescence level and pattern were different for the 
three genomic locations tested, probably reflecting the in- 
fluence of surrounding DNA regulatory elements on the 
transcriptional activity of the marker gene. The triple- 
plasmid microinjection resulted in a successful introduc- 
tion of germ line transmitting mutations at each locus: 
1.5% (2/130) for y, 10.5% (16/152) for upd and 2.2% 
(4/186) for wg (Table 3). Using the best-performing TALEN-pair,
which targets upd, we examined the frequency of HR events with
a higher concentration (500 ng/µl) of donor DNA plasmid
(Table 3). Unexpectedly, increasing the donor DNA concentration
decreased the frequency of HR-mediated
gene integration from 10.5% (16/152) to 4.0% (12/297). Genomic
DNA of EGFP-positive flies was amplified to confirm
targeted gene integration (Figure 4G-J). Amplicons of
upd^TALEN-EGFP and wg^TALEN-EGFP were also sequenced
to verify that they indeed represent TALEN-mediated 
targeted gene integration (data not shown). These results 
demonstrate that targeted mutations via gene integration 
can be efficiently induced in Drosophila and easily scored 
with a traceable recombination marker. 



DISCUSSION 

Introduction of site-specific genetic modifications in a
controlled manner has been recognized as an ultimate
approach in biosciences to understand the function of a
gene product or gene regulation. The development of
customizable engineered nucleases such as ZFNs and, more
recently, TALENs marked a leap forward for targeted
genome editing in a variety of organisms. In this study,
we developed a new TALEN synthesis method named 
easyT. This protocol allows construction of custom 
TALENs from monomer units in a single day and relies 
only on standard techniques commonly used in a molecu- 
lar biology lab. Other assembly methods that permit con- 
struction of TALENs in a single day require either solid 
phase high-throughput ligation technology or a library of 
preassembled 4- or 5-mer unit plasmids (40,43,44). 
Furthermore, an additional feature of easyT is the ability
to replace any individual 4-mer of the assembled units.
The time needed to synthesize a TALEN by a given protocol depends
on successful cloning. Many of the currently available
TALEN assembly protocols are based on either plasmid- 
or PCR-based Golden Gate cloning (39), where the 
correct assembly is confirmed at the end by sequencing. 
At this point, mutations introduced in TALE-repeat 
cannot be fixed, and either more clones are sequenced or 
the assembly procedure has to be repeated. Therefore, the 
ability to replace a part of TALE-repeats is a valuable 
troubleshooting feature of easyT. 

TALEN technology has been quickly and successfully 
applied to a wide variety of cells and organisms in the past 
years. To evaluate the utilities of TALENs in each model 
Figure 4. TALEN-mediated targeted marker gene integration. (A-C) Schematics of TALEN target loci and corresponding donor DNA. TALEN
target sites are indicated with scissors. Scale bar in each panel represents 1 kb length. (D-F) Mutant progeny of microinjected embryos with
TALEN-mediated targeted 3xP3-EGFP integration. y^TALEN-EGFP flies also show the expected yellow body colour (D). upd^TALEN-EGFP and
Red and blue arrows indicate primers used for genomic DNA amplification.



organism, most of the pioneering work has scored the 
frequency of NHEJ-mediated targeted mutations using 
the genes that express known visible phenotypes. In the
practical use of TALEN technology, however, it is often 
required to target loci in which resulting phenotypes are 
not known. Because the frequencies of mutant production 
are generally not high enough in Drosophila (18,21,30), the 
identification of mutants could be the key step in 
Drosophila genome engineering. In this work, we 
demonstrated that TALEN-directed targeted genome
alteration through HR with exogenous donor DNA
can be an efficient and straightforward approach for mutant
generation and identification. In this regard, we demonstrated
the utility of a traceable recombination marker
introduced by the donor DNA. First, it allows identification
of germ line transmitting mutants as soon as the
progeny of injected embryos start to hatch. Second, it
enables tracking of the mutated alleles for generating
hetero- or homozygous stable lines for subsequent experiments.
Finally, the easy identification allows more
flexibility in the mode of TALEN introduction into
Drosophila embryos, and we showed that TALEN
plasmid microinjection is a convenient alternative to
mRNA microinjection in Drosophila. Judging
from the limited number of examples in this work and
previous studies (18,21), we speculate that a slight increase in
mutant frequencies would add little to the simple identification of
mutants by traceable markers, which shifts the focus from TALEN
efficiency to the overall convenience of the method. For
example, the use of plasmid DNA for TALEN
example, the use of plasmid DNA for TALEN 



introduction eliminates additional mRNA synthesis steps 
and possible concerns on the quality of TALEN mRNA 
during microinjection. Taken together, HR-mediated gene 
integration with donor DNA bearing a traceable marker 
appears to be a more favourable and practical strategy.

The utility of TALEN-mediated targeted gene 
integration can be extended to precise genome engineering 
by combining TALEN-mediated targeted gene integration 
with site-specific recombinase/integrase systems such as 
flippase (Flp)-FRT, Cre-loxP or PhiC31 (46), and the 
piggyBac transposon system (47). For example, by using 
the 3xP3-EGFP cassette having flp/Cre recognition 
sites besides the 3xP3 promoter, the 3xP3 can be 
removed by appropriate recombinases, while leaving 
EGFP at the target site (Supplementary Figure S6). In 
this way, it is now possible to tag endogenous gene
products with a fluorescent protein regardless of their expression
levels.

SUPPLEMENTARY DATA 

Supplementary Data are available at NAR Online, 
including [48]. 